Description

BACKGROUND OF THE INVENTION 

[0001] The present invention relates to recombinant DNA which encodes the BpmI restriction endonuclease as well as BpmI methyltransferase, and to expression of BpmI restriction endonuclease from E. coli cells containing the recombinant DNA. BpmI is an isoschizomer of GsuI (Fermentas 2000-2001 Catalog, Product No. ER0461/ER0462).
[0002] Type II restriction endonucleases are a class of enzymes that occur naturally in bacteria and in some viruses. When they are purified away from other bacterial proteins, restriction endonucleases can be used in the laboratory to cleave DNA molecules into small fragments for molecular cloning and gene characterization.

[0003] Restriction endonucleases act by recognizing and binding to particular sequences of nucleotides (the 'recognition sequence') along the DNA molecule. Once bound, they cleave the molecule within, to one side of, or to both sides of the recognition sequence. Different restriction endonucleases have affinity for different recognition sequences. Over two hundred and eleven restriction endonucleases with unique specificities have been identified among the many hundreds of bacterial species that have been examined to date (Roberts and Macelis, Nucl. Acids Res. 27:312-313 (1999)).

[0004] Restriction endonucleases typically are named according to the bacteria from which they are derived. Thus, the species Deinococcus radiophilus, for example, produces three different restriction endonucleases, named DraI, DraII and DraIII. These enzymes recognize and cleave the sequences 5'TTT/AAA3', 5'PuG/GNCCPy3' and 5'CACNNN/GTG3', respectively. Escherichia coli RY13, on the other hand, produces only one enzyme, EcoRI, which recognizes the sequence 5'G/AATTC3'.
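The recognition sequences quoted above use degenerate base symbols (Pu for purine, Py for pyrimidine, N for any base). The following is a minimal Python sketch, assuming the standard IUPAC one-letter codes (R for Pu, Y for Py), of how such a sequence may be matched against a DNA string. The function names and the test sequence are illustrative only and form no part of the disclosure.

```python
import re

# Subset of the IUPAC degenerate nucleotide codes needed for these examples.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}

def site_regex(site):
    """Translate a recognition sequence written in IUPAC codes into a regex string."""
    return "".join(IUPAC[base] for base in site.upper())

def find_sites(site, dna):
    """Return the 0-based start of every match, including overlapping ones."""
    # The lookahead lets matches that overlap each other all be reported.
    pattern = re.compile("(?=" + site_regex(site) + ")")
    return [m.start() for m in pattern.finditer(dna.upper())]

# DraII recognizes 5'PuG/GNCCPy3', i.e. RGGNCCY in IUPAC notation.
print(find_sites("RGGNCCY", "TTAGGTCCTAAGGGCCCC"))
# EcoRI recognizes the non-degenerate sequence GAATTC.
print(find_sites("GAATTC", "AAGAATTCAA"))
```

Only the top strand is scanned here; for a non-palindromic sequence the reverse complement would have to be searched as well.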

[0005] A second component of bacterial restriction-modification (R-M) systems are the methyltransferases (methylases). These enzymes are complementary to restriction endonucleases and they provide the means by which bacteria are able to protect their own DNA and distinguish it from foreign, infecting DNA. Modification methylases recognize and bind to the same recognition sequence as the corresponding restriction endonuclease, but instead of cleaving the DNA, they chemically modify one particular nucleotide within the sequence by the addition of a methyl group (C5-methylcytosine, N4-methylcytosine, or N6-methyladenine). Following methylation, the recognition sequence is no longer cleaved by the cognate restriction endonuclease. The DNA of a bacterial cell is always fully modified by the activity of its modification methylase. It is therefore completely insensitive to the presence of the endogenous restriction endonuclease. It is only unmodified, and therefore identifiably foreign DNA, that is sensitive to restriction endonuclease recognition and cleavage.

[0006] By means of recombinant DNA technology, it is now possible to clone genes and overproduce the enzymes in large quantities. The key to isolating clones of restriction endonuclease genes is to develop a simple and reliable method to identify such clones within complex genomic DNA libraries, i.e. populations of clones derived by 'shotgun' procedures, when they occur at frequencies as low as 10^-3 to [illegible]. Preferably, the method should be selective, such that the unwanted majority of clones are destroyed while the desirable rare clones survive.

[0007] A large number of type II restriction-modification systems have been cloned. The first cloning method used bacteriophage infection as a means of identifying or selecting restriction endonuclease clones (EcoRII: Kosykh et al., Mol. Gen. Genet. 178:717-719 (1980); HhaII: Mann et al., Gene 3:97-112 (1978); PstI: Walder et al., Proc. Nat. Acad. Sci. 78:1503-1507 (1981)). Since the presence of restriction-modification systems in bacteria enables them to resist infection by bacteriophage, cells that carry cloned restriction-modification genes can, in principle, be selectively isolated as survivors from genomic DNA libraries that have been exposed to phages. This method has been found, however, to have only limited value. Specifically, it has been found that cloned restriction-modification genes do not always manifest sufficient phage resistance to confer selective survival.

[0008] Another cloning approach involves transferring systems initially characterized as plasmid-borne into E. coli cloning plasmids (EcoRV: Bougueleret et al., Nucl. Acids. Res. 12:3659-3676 (1984); PaeR7: Gingeras and Brooks, Proc. Natl. Acad. Sci. USA 80:402-406 (1983); Theriault and Roy, Gene 19:355-359 (1982); PvuII: Blumenthal et al., J. Bacteriol. 164:501-509 (1985); Tsp45I: Wayne et al. Gene 202:83-88 (1997)).

[0009] A third approach is to select for active expression of methylase genes (methylase selection) (U.S. Patent No. 5,200,333 and BsuRI: Kiss et al., Nucl. Acids. Res. 13:6403-6421 (1985)). Since R-M genes are often closely linked together, both genes can often be cloned simultaneously. This selection does not always yield a complete restriction system however, but instead yields only the methylase gene (BspRI: Szomolanyi et al., Gene 10:219-225 (1980); BcnI: Janulaitis et al., Gene 20:197-204 (1982); BsuRI: Kiss and Baldauf, Gene 21:111-119 (1983); and MspI: Walder et al., J. Biol. Chem. 258:1235-1241 (1983)).

[0010] A more recent method, the "endo-blue method", has been described for direct cloning of restriction endonuclease genes in E. coli based on the indicator strain of E. coli containing the dinD::lacZ fusion (Fomenkov et al., U.S. Patent No. 5,498,535, (1996); Fomenkov et al., Nucl. Acids Res. 22:2399-2403 (1994)). This method utilizes the E. coli SOS response signals following DNA damage caused by restriction endonucleases or non-specific nucleases. A number of thermostable nuclease genes (TaqI, Tth111I, BsoBI, Tf nuclease) have been cloned by this method (U.S. Patent No. 5,498,535).

[0011] Because purified restriction endonucleases, and to a lesser extent, modification methylases, are useful tools for creating recombinant molecules in the laboratory, there is a commercial incentive to obtain bacterial strains through recombinant DNA techniques that produce large quantities of restriction enzymes. Such overexpression strains should also simplify the task of enzyme purification.

SUMMARY OF THE INVENTION 

[0012] The present invention relates to a method for cloning the BpmI restriction endonuclease from Bacillus pumilus into E. coli by methylase selection and inverse PCR amplification of the adjacent DNA of the BpmI methylase gene.
[0013] The present invention relates to recombinant BpmI and methods for producing the same. BpmI restriction endonuclease is found in the strain of Bacillus pumilus (New England Biolabs' strain collection #711). It recognizes the double-stranded DNA sequence 5' CTGGAG 3' (or 5' CTCCAG 3') and cleaves 16/14 bases downstream of its recognition sequence (N16/N14) to generate 2-base 3' overhanging ends.
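The N16/N14 cleavage geometry may be illustrated with a short Python sketch. It is a sketch under stated assumptions, not part of the disclosed method: cut positions are expressed as 0-based slice boundaries on the top strand, and the function names and test sequence are hypothetical.

```python
SITE = "CTGGAG"  # BpmI recognition sequence, as stated in the text

def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def bpmi_cuts(dna):
    """Return (strand, site_start, top_cut, bottom_cut) for each site.

    Cuts are slice boundaries on the top strand: the strand carrying
    CTGGAG is cut 16 nt past the site, its complement 14 nt past it.
    """
    dna = dna.upper()
    cuts = []
    for strand, motif in (("+", SITE), ("-", revcomp(SITE))):
        start = dna.find(motif)
        while start != -1:
            if strand == "+":
                top, bottom = start + 6 + 16, start + 6 + 14
            else:  # site lies on the bottom strand, so cleavage is to the left
                top, bottom = start - 14, start - 16
            # Skip sites whose cut positions fall off the end of the molecule.
            if min(top, bottom) >= 0 and max(top, bottom) <= len(dna):
                cuts.append((strand, start, top, bottom))
            start = dna.find(motif, start + 1)
    return cuts

dna = "AA" + SITE + "ACGTACGTACGTACGTACGT" + "TT"
for strand, start, top, bottom in bpmi_cuts(dna):
    # The bases between the two boundaries form the 2-base 3' overhang.
    print(strand, start, dna[min(top, bottom):max(top, bottom)])
```

The two boundaries always differ by two positions, which corresponds to the 2-base 3' overhang described above.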

[0014] By methylase selection, a methylase gene with high homology to amino-methyltransferases (N6-adenine methylases) was found in a DNA library. This gene was named BpmI M1 gene (bpmIM1, 1650 bp), encoding a 549-aa protein with a predicted molecular mass of 63,702 daltons. There was one partial open reading frame upstream of the bpmIM1 gene that displayed 31% amino acid sequence identity to another restriction enzyme, Eco57I, with a similar recognition sequence (Eco57I recognition sequence: 5' CTGAAG N16/N14; BpmI recognition sequence: 5' CTGGAG N16/N14; A. Janulaitis et al. Nucl. Acids Res. 20:6051-6056 (1992)).

[0015] In order to clone the rest of the bpmIRM gene, inverse PCR was used to amplify the adjacent DNA sequence. After four rounds of inverse PCR reactions, an open reading frame of 3030 bp was found upstream of the BpmI M1 methylase gene, which encodes a 1009-aa protein with a predicted molecular mass of 116,891 daltons. By amino acid sequence comparison of BpmI endonuclease with all known proteins in the GenBank protein database, it was discovered that BpmI endonuclease is a fusion of two distinct elements with possible structural domains of restriction-methylation-specificity (R-M-S). This domain organization is analogous to the type I restriction-modification system with three distinct subunits R, M, and S. Because BpmI is quite distinct from other type IIs restriction enzymes, it is proposed that BpmI belongs to a subgroup of type II restriction enzymes called type IIf (f stands for fusion of restriction-modification domains).
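The stated gene and protein sizes are mutually consistent, as the following Python sketch shows. It assumes that each open reading frame length includes one terminal stop codon; the function name is illustrative.

```python
def orf_summary(orf_bp, mass_da):
    """Return (residue count, average residue mass in Da) for an ORF."""
    codons, remainder = divmod(orf_bp, 3)
    if remainder:
        raise ValueError("ORF length is not a multiple of three")
    residues = codons - 1  # assumption: the final codon is the stop codon
    return residues, mass_da / residues

# Lengths and predicted masses as given in the text.
for gene, orf_bp, mass_da in (("bpmIM1", 1650, 63702), ("bpmIRM", 3030, 116891)):
    residues, avg = orf_summary(orf_bp, mass_da)
    print(gene, residues, "aa,", round(avg, 1), "Da per residue")
```

Both genes give the residue counts stated above (549 and 1009), with an average residue mass of roughly 116 daltons in each case.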

[0016] To generate a premodified expression host, the bpmIM1 gene was amplified by PCR and cloned in E. coli strain ER2566. BpmI M1 methylase also modifies the XhoI site. The XhoI recognition sequence 5' CTCGAG 3' is similar to the BpmI recognition sequence 5' CTGGAG 3', with only one base difference. It was concluded that BpmI M1 methylase may recognize the sequence 5' CTNNAG 3' and possibly modify the adenine base to create N6-methyladenine in the symmetric sequence.
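The reasoning above can be checked with a short Python sketch: the degenerate sequence CTNNAG covers both the BpmI and XhoI sites and is its own reverse complement, whereas the BpmI site itself is asymmetric. The helper names are illustrative and only the symbols A, C, G, T and N are handled.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}

def revcomp(site):
    """Reverse complement of a site written with A, C, G, T and N."""
    return "".join(COMPLEMENT[base] for base in reversed(site))

def covers(pattern, site):
    """True if every position of the pattern is N or equals the site base."""
    return len(pattern) == len(site) and all(
        p in ("N", s) for p, s in zip(pattern, site))

for name, site in (("BpmI", "CTGGAG"), ("XhoI", "CTCGAG")):
    print(name, "covered by CTNNAG:", covers("CTNNAG", site),
          "| palindromic:", revcomp(site) == site)
print("CTNNAG symmetric:", revcomp("CTNNAG") == "CTNNAG")
```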

[0017] The expression of the 3030-bp bpmIRM gene was quite difficult because of the large size of the PCR product. The bpmIRM gene was first amplified by Taq DNA polymerase and cloned into the premodified host, but no BpmI activity was detected. To improve the fidelity of the PCR reaction, Deep Vent DNA polymerase was used in PCR. Among 18 clones with the insert, only one clone (#4) displayed partial BpmI activity. This clone was sequenced and confirmed to contain the wild type sequence.


BRIEF DESCRIPTION OF THE DRAWINGS 
[0018]

Figure 1 Gene organization of the BpmI restriction-modification system. Genes bpmIRM and bpmIM1 code for BpmI endonuclease (BpmI endonuclease-methylase fusion protein) and BpmI M1, respectively. BpmI-Δ#1, BpmI-Δ#2, and BpmI-Δ#3 are deletion mutants with deletions in the methylation or specificity domains.

Figure 2 DNA sequence of BpmI M1 methylase gene (bpmIM1) (SEQ ID NO:1) and its encoded amino acid sequence (SEQ ID NO:2).

Figure 3 DNA sequence of BpmI endonuclease gene (bpmIRM) (SEQ ID NO:3) and its encoded amino acid sequence (SEQ ID NO:4).

Figure 4 Recombinant BpmI endonuclease activity in column fractions following heparin Sepharose chromatography. Lane 1: purified native BpmI endonuclease; lanes 2 to 23: heparin Sepharose column fractions. Fractions 11 to 14 gave rise to complete BpmI digestion of λ DNA. The remaining fractions contain no or partial BpmI activity. Lane 24: 1 kb DNA size marker.
[0019] The method described herein by which the BpmI methylase gene and the BpmI restriction endonuclease genes are preferably cloned and expressed in E. coli employs the following steps:

1. Preparation of genomic DNA and restriction digestion of genomic DNA.

[0020] Genomic DNA is prepared from Bacillus pumilus (New England Biolabs collection #711) by the standard procedure. Five µg of genomic DNA is digested partially with 2, 1, 0.5, and 0.25 units of ApoI (recognition sequence R/AATTY). Genomic DNA fragments in the range of 2-10 kb are purified through a low-melting agarose gel. Genomic and pBR322 DNA are also digested with AatII, BamHI, ClaI, EagI, EcoRI, HindIII, NdeI, NheI, SalI, and SphI, respectively; however, no methylase-positive clones were obtained.

2. Construction of ApoI partial genomic DNA library and challenge of library with BpmI.

[0021] The ApoI partial DNA fragments are ligated to EcoRI-digested and CIP-treated pBR322 vector. The ligated DNA is transferred into E. coli RR1 competent cells by electroporation. Transformants are pooled and amplified. Plasmid DNA is prepared from the cells and challenged with BpmI. Following BpmI digestion, the challenged DNA is transformed into RR1 cells. Survivors are screened for resistance to BpmI digestion. Two clones, #18 and #26, were found to be resistant to BpmI digestion. AatII, BamHI, ClaI, EagI, EcoRI, HindIII, NdeI, NheI, SalI, and SphI digested genomic DNA were also ligated to pBR322 with compatible ends and genomic DNA libraries are constructed. However, no apparent BpmI-resistant clones were discovered from these libraries.
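ApoI fragments can be ligated into the EcoRI-cut vector because both enzymes leave the same 5' overhang. A minimal Python sketch of that comparison follows; it assumes a palindromic site with the top-strand cut marked by "/" (the BamHI site is added only as a contrasting, well-known example), and the function name is hypothetical.

```python
def five_prime_overhang(site_with_cut):
    """Single-stranded 5' overhang left by a symmetric cut in a palindromic site."""
    cut = site_with_cut.index("/")
    site = site_with_cut.replace("/", "")
    if cut >= len(site) - cut:
        raise ValueError("cut position does not leave a 5' overhang")
    # The bottom strand is cut at the mirror-image position.
    return site[cut:len(site) - cut]

apoi = five_prime_overhang("R/AATTY")    # ApoI, as given in the text
ecori = five_prime_overhang("G/AATTC")   # EcoRI, as given in the text
bamhi = five_prime_overhang("G/GATCC")   # BamHI, shown for contrast
print(apoi, ecori, apoi == ecori, apoi == bamhi)
```

Because ApoI accepts either purine at the first position, only a subset of the ApoI-EcoRI junctions would be expected to regenerate an EcoRI site.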

3. Subcloning and DNA sequencing of the resistant clone.

[0022] One resistant clone, #26, contained an insert of about 3.1 kb. The forward and reverse primers of pUC19 were used to sequence the insert. Three ApoI and one HindIII fragments were subcloned in pUC19 and sequenced. The entire insert was sequenced by primer walking. A methylase gene with high homology to amino-methyltransferases is found within the insert, which is named the BpmI M1 gene. The bpmIM1 gene is 1,650 bp, encoding a 549-amino acid protein with a predicted molecular mass of 63,702 daltons.

4. Cloning of BpmI restriction endonuclease gene (bpmIRM) by inverse PCR.

[0023] In accordance with the present invention, it was determined that there was one partial open reading frame upstream of the bpmIM1 gene that has 31% amino acid sequence identity to another restriction enzyme, Eco57I, with a similar recognition sequence (Eco57I recognition sequence: 5'CTGAAG N16/N14; A. Janulaitis et al. Nucl. Acids Res. 20:6051-6056 (1992); BpmI recognition sequence: 5'CTGGAG N16/N14). Genomic DNA is digested with restriction enzymes. The digested DNA is ligated at a low DNA concentration and then used for inverse PCR amplification of the bpmIR gene. Inverse PCR products are derived, gel-purified from low-melting agarose and sequenced. After four rounds of inverse PCR reactions, an open reading frame of 3,030 bp was found upstream of the BpmI M1 methylase gene, which encoded a 1,009-amino acid protein with a predicted molecular mass of 116,891 daltons. This is one of the largest restriction enzymes discovered so far. By amino acid sequence comparison of BpmI endonuclease with all known proteins in the GenBank protein database, it is discovered that BpmI endonuclease is a fusion of two distinct elements with possible structural domains of restriction-methylation-specificity (R-M-S). This domain organization is analogous to the type I restriction-modification system with three distinct subunits, restriction, methylation, and specificity (R, M, and S). Because BpmI is quite distinct from other type IIs restriction enzymes, it is suggested that BpmI belongs to a subgroup of type II restriction enzymes called type IIf (f stands for fusion of restriction-modification-specificity domains).

5. Expression of bpmIM1 gene in E. coli.

[0024] Two primers are synthesized to amplify the bpmIM1 gene in PCR. Following digestion with BamHI and SphI, the PCR product is ligated into pACYC184 with the compatible ends. The ligated DNA is transformed into ER2566 competent cells. Plasmids with bpmIM1 gene inserts are tested for resistance to BpmI digestion. Two out of 18 clones were found to be resistant to BpmI digestion, indicating efficient BpmI M1 expression in E. coli cells and BpmI site modification on the expression plasmid. The host cell ER2566 [pACYC-bpmIM1] is used for expression of the bpmIRM gene.

[0025] BpmI M1 methylase also modifies the XhoI site. The XhoI recognition sequence 5'CTCGAG3' is similar to the BpmI recognition sequence 5'CTGGAG3', with only one base difference. It is concluded that BpmI M1 methylase may recognize the sequence 5'CTNNAG3' and modify the adenine base to generate N6-methyladenine in the symmetric sequence.

6. Expression of bpmIRM gene in E. coli using a T7 expression vector.

[0026] The 3,030-bp bpmIRM gene was amplified in PCR using Taq DNA polymerase, digested with BamHI and ligated into BamHI-digested T7 expression vectors pAII17 and pET21a. After transformation of the ligated DNA into ER2566 [pACYC-bpmIM1], transformants were screened for the endonuclease gene insert. Seven out of 72 clones contained the insert with correct orientation. However, no BpmI activity was detected in cell extracts of IPTG-induced cells. This is probably due to mutations introduced during the PCR process.

[0027] To reduce the mutation frequency, Deep Vent® DNA polymerase was used in PCR reactions to amplify the 3030-bp bpmIRM gene. The PCR product was digested with BamHI and XbaI and ligated to T7 expression vectors pAII17 and pET21at. Eighteen out of 36 clones contain the correct size insert. Ten ml cell cultures for all 18 clones were induced with IPTG and cell extracts were prepared and assayed for BpmI activity. Clone #4 displayed partial BpmI activity.


7. Partial purification of recombinant BpmI activity.

[0028] Five hundred ml of cell culture was made for the expression clone #4 ER2566 [pACYC-bpmIM1, pET21at-bpmIRM]. Cell extract (40 ml) containing BpmI was purified through a heparin Sepharose column. Proteins were eluted with a NaCl gradient of 50 mM to 1 M. Fractions 6 to 27 are assayed for BpmI activity on λ DNA. It was found that fractions 15 to 18 contained the highest BpmI activity (Figure 4). The yield was estimated at 1,800 units of BpmI per gram of wet E. coli cells. The specific activity was estimated at 24,000 units per mg of protein.
[0029] The present invention is further illustrated by the following Examples. These Examples are provided to aid in the understanding of the invention and are not to be construed as a limitation thereof.

[0030] The references cited above and below are hereby incorporated by reference herein.

EXAMPLE 1

CLONING OF BpmI RESTRICTION-MODIFICATION SYSTEM IN E. COLI

1. Preparation of genomic DNA and restriction digestion of genomic DNA.

[0031] Genomic DNA is prepared from Bacillus pumilus (New England Biolabs collection #711) by the standard procedure consisting of the following steps:

(a) cell lysis by addition of lysozyme (2 mg/ml final), sucrose (1% final), and 50 mM Tris-HCl, pH 8.0;

(b) cell lysis by addition of 10% SDS (final concentration 0.1%);

(c) cell lysis by addition of 1% Triton X-100 and 62 mM EDTA, 50 mM Tris-HCl, pH 8.0;

(d) phenol-CHCl3 extraction of DNA 3 times (equal volume) and CHCl3 extraction one time;

(e) DNA dialysis in 4 liters of TE buffer, change 3x; and

(f) RNA was removed by RNase A treatment and the genomic DNA was precipitated in ethanol and resuspended in TE buffer.

[0032] Five µg of genomic DNA was digested partially with 2, 1, 0.5, and 0.25 units of ApoI (recognition sequence R/AATTY) at 50°C for 30 min. Genomic DNA fragments in the range of 2-10 kb were purified through a 1% low-melting agarose gel. Genomic and pBR322 DNA were also digested with AatII, BamHI, ClaI, EagI, EcoRI, HindIII, NdeI, NheI, SalI, and SphI, respectively. Genomic DNA fragments were ligated to pBR322 with compatible ends.

2. Construction of ApoI partial genomic DNA library and challenge of library with BpmI.

[0033] The ApoI partial DNA fragments were ligated to EcoRI-digested and CIP-treated pBR322 vector. The ligated DNA was dialyzed by drop dialysis on 4 L of distilled water and transferred into E. coli RR1 competent cells by electroporation. Ap^R transformants were pooled and amplified. Plasmid DNA was prepared from the overnight cells and challenged with BpmI. Following BpmI digestion, the challenged DNA was transformed into RR1 cells. Ap^R survivors were screened for resistance to BpmI digestion. A total of 36 plasmid mini-preparations were made. Two clones, #18 and #26, were found to be resistant to BpmI digestion. AatII, BamHI, ClaI, EagI, EcoRI, HindIII, NdeI, NheI, SalI, and SphI digested genomic DNA were also ligated to pBR322 with compatible ends and genomic DNA libraries were constructed. However, no apparent BpmI-resistant clones were discovered from these libraries after screening more than 144 clones.
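The logic of the challenge and re-screen can be sketched as a toy model in Python. The clone names and flags below are hypothetical; the model assumes only that a plasmid survives the challenge if its BpmI sites are methylated or if it escaped an incomplete digest, and that only methylated plasmids remain resistant when digested again.

```python
from dataclasses import dataclass

@dataclass
class Clone:
    name: str
    methylase_expressed: bool        # plasmid carries an active methylase gene
    escaped_digestion: bool = False  # uncut carry-over from an incomplete digest

def survives_challenge(clone):
    """Circular plasmid remains after the BpmI challenge and can transform cells."""
    return clone.methylase_expressed or clone.escaped_digestion

def resistant_on_rescreen(clone):
    """Only modified plasmids stay uncut when the mini-prep DNA is digested again."""
    return clone.methylase_expressed

library = [
    Clone("clone_A", methylase_expressed=True),
    Clone("clone_B", methylase_expressed=False),
    Clone("clone_C", methylase_expressed=False, escaped_digestion=True),
]
survivors = [c for c in library if survives_challenge(c)]
confirmed = [c.name for c in survivors if resistant_on_rescreen(c)]
print([c.name for c in survivors], confirmed)
```

This is why the survivors are screened individually for resistance rather than taken as methylase clones directly.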

3. Subcloning and DNA sequencing of the resistant clone.

[0034] One resistant clone, #26, contains an insert of about 3.1 kb. The forward and reverse primers of pUC19 were used to sequence the insert. Three ApoI and one HindIII fragments were gel-purified and subcloned in pUC19 and sequenced. The rest of the insert was sequenced by primer walking. A methylase gene with high homology to amino-methyltransferases (N6-adenine methylases) was found within the insert, which was named the BpmI M1 gene. The bpmIM1 gene is 1,650 bp, encoding a 549-amino acid protein with a predicted molecular mass of 63,702 daltons.


4. Cloning of BpmI restriction endonuclease gene (bpmIRM) by inverse PCR.

[0035] There is one partial open reading frame upstream of the bpmIM1 gene that has 31% amino acid sequence identity to another restriction enzyme, Eco57I, with a similar recognition sequence (Eco57I recognition sequence: 5'CTGAAG N16/N14; A. Janulaitis et al. Nucl. Acids Res. 20:6051-6056 (1992); BpmI recognition sequence: 5'CTGGAG N16/N14). Genomic DNA was digested with restriction enzymes AseI, BclI, HaeII, HpaII, MboI, MseI, NlaIII, PacI, and Tsp509I. The digested DNA was ligated at a low DNA concentration of 2 µg/ml and then used for inverse PCR amplification of the bpmIR gene. The sequence of the inverse PCR primers was the following:


5' gtggaaacggaccgtattatggtt 3' (232-34) (SEQ ID NO: 5) 



5' caccagtaaataacaggttattcc 3' (232-35) (SEQ ID NO:6)



[0036] Inverse PCR conditions were 94°C 1 min, 55°C 1 min, 72°C 2 min for 35 cycles. Inverse PCR products were derived from HaeII and NlaIII templates, gel-purified from low-melting agarose and sequenced using primers 232-34 and 35.
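The principle of inverse PCR, in which outward-facing primers in a known region amplify the unknown flanking DNA of a self-ligated restriction fragment, can be sketched in Python as follows. The toy sequences and function names are hypothetical and are not taken from the sequence listing; the sketch assumes exact primer matches that do not span the ligation junction.

```python
def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def inverse_pcr_product(circle, fwd_primer, rev_primer):
    """Top-strand product amplified from a circular template by outward-facing primers."""
    start = circle.find(fwd_primer)
    rev_site = revcomp(rev_primer)
    if start == -1 or rev_site not in circle:
        raise ValueError("primer does not anneal to the template")
    doubled = circle + circle  # lets a linear search run across the ligation junction
    end = doubled.find(rev_site, start + len(fwd_primer)) + len(rev_site)
    return doubled[start:end]

# A restriction fragment: unknown flanks on either side of a known region.
left_flank, known, right_flank = "TTTTCC", "ATGGCGTACCTTGACGGA", "CCAAAA"
circle = left_flank + known + right_flank  # self-ligated at low DNA concentration
fwd = known[-6:]           # reads out of the known region to the right
rev = revcomp(known[:6])   # reads out of the known region to the left
product = inverse_pcr_product(circle, fwd, rev)
print(product)
```

The product carries the right flank joined to the left flank across the ligation junction, so sequencing it with the same primers reveals DNA on both sides of the known region.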

[0037] The primers for second round of inverse PCR were the following: 



5' ttcgtagcaagtacggtccatatcagt 3' (233-76) (SEQ ID NO: 7) 



5' ccgtatgtacttgataggaataacctg 3' (233-77) (SEQ ID NO: 8) 


[0038] Genomic DNA was digested with AseI, BclI, BsrFI, BstNI, EcoRI, HincII, HindIII, HpaII, NcoI, PacI, PvuI, TaqI, TfiI, and XbaI. The digested DNA was ligated at a low DNA concentration of 2 µg/ml and then used for inverse PCR amplification of the bpmIR gene. Inverse PCR conditions were 94°C 1 min, 55°C 1 min, 72°C 2 min for 35 cycles. Inverse PCR products were derived from AseI, HindIII, HpaII, and TaqI templates, gel-purified from low-melting agarose and sequenced using primers 233-76 and 77.

[0039] The primers for third round of inverse PCR were the following: 



5' aggaactaagaaagttcatagctg 3' (234-61) (SEQ ID NO:9) 
5' atgcggtattatataacccaacag 3' (234-62) (SEQ ID NO: 10)



[0040] Genomic DNA was digested with AflIII, BspHI, BstNI, EcoRI, HaeII, HinP1I, HhaI, HindIII, StyI, and XmnI. The digested DNA was ligated at a low DNA concentration of 2 µg/ml and then used for inverse PCR amplification of the bpmIR gene. Inverse PCR conditions were 94°C 1 min, 55°C 1 min, 72°C 2 min for 35 cycles. Inverse PCR products were derived from HinP1I and XmnI templates, gel-purified from low-melting agarose and sequenced using primers 234-61 and 62.

[0041] The primers for the fourth round of inverse PCR were the following: 



5' tgacgtcctcttcacctaattcgg 3' (235-50) (SEQ ID NO: 11) 



5' gagtttgtgaagatagaaccattg 3' (235-51) (SEQ ID NO: 12) 

[0042] Genomic DNA was digested with ApoI, BstBI, BstYI, ClaI, EcoRI, NdeI, RsaI, Sau3AI, SspI, TaqI, and XmnI. The digested DNA was ligated at a low DNA concentration of 2 µg/ml and then used for inverse PCR amplification of the bpmIR gene. Inverse PCR conditions were 94°C 1 min, 56°C 1 min, 72°C 2 min for 35 cycles. Inverse PCR products were derived from ApoI, ClaI, NdeI, RsaI, SspI, and TaqI templates, gel-purified from low-melting agarose and sequenced using primers 235-50 and 51. The ClaI fragment (2.4 kb) extends further upstream of the bpmIRM gene. The rest of the ClaI fragment was sequenced using primer walking.

[0043] After four rounds of inverse PCR reactions, an open reading frame of 3,030 bp was found upstream of the BpmI M1 methylase gene, which encodes a 1,009-amino acid protein with a predicted molecular mass of 116,891 daltons. This is one of the largest restriction enzymes discovered so far. By amino acid sequence comparison of BpmI endonuclease with all known proteins in the GenBank protein database, it was discovered that BpmI endonuclease is a fusion of two distinct elements with possible structural domains of restriction-methylation-specificity (R-M-S). This domain organization is analogous to the type I restriction-modification system with three distinct subunits, restriction, methylation, and specificity (R, M, and S). Because BpmI is quite distinct from other type IIs restriction enzymes, it is proposed that BpmI belongs to a subgroup of type II restriction enzymes called type IIf (f stands for fusion of restriction-modification-specificity domains).

5. Expression of bpmIM1 gene in E. coli.

[0044] Two primers are synthesized to amplify the bpmIM1 gene in PCR. The primer sequences are:

forward: 

5' agcggatccggaggtaaataaatgaatcaattaattgaaaatgttaat 3' 
(238-177) (SEQ ID NO: 13) 

reverse: 

5' aagggggcatgcttatacttatttcttcgttctattgtttct 3' (238-178) 
(SEQ ID NO: 14) 

[0045] Following digestion with BamHI and SphI, the PCR product was ligated into pACYC184 with the compatible ends. The ligated DNA was transformed into ER2566 competent cells. Cm^R transformants were plated at 37°C overnight. Plasmids with bpmIM1 gene inserts were tested for resistance to BpmI digestion. Two out of 18 clones showed full resistance to BpmI digestion, indicating efficient BpmI M1 expression in E. coli cells and BpmI site modification on the expression plasmid. The host cell ER2566 [pACYC-bpmIM1] was used for expression of the bpmIRM gene.
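That each cloning primer carries the restriction site used for its digestion can be confirmed with a short Python sketch. The primer sequences are those of SEQ ID NO: 13 and 14 above; the dictionary layout and function name are illustrative only.

```python
# Recognition sequences of the two enzymes used to digest the PCR product.
SITES = {"BamHI": "GGATCC", "SphI": "GCATGC"}

# Primer name -> (sequence as listed, enzyme whose site it should carry).
PRIMERS = {
    "238-177": ("agcggatccggaggtaaataaatgaatcaattaattgaaaatgttaat", "BamHI"),
    "238-178": ("aagggggcatgcttatacttatttcttcgttctattgtttct", "SphI"),
}

def site_position(primer, enzyme):
    """0-based position of the enzyme's site in the primer, or -1 if absent."""
    return primer.upper().find(SITES[enzyme])

for name, (sequence, enzyme) in PRIMERS.items():
    position = site_position(sequence, enzyme)
    # The position also equals the number of flanking bases 5' of the site.
    print(name, enzyme, "site at", position, "with", position, "bases 5' of it")
```

The forward primer also contains the start codon ATG downstream of the sequence GGAGGT, which presumably serves as a ribosome binding site.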
[0046] BpmI M1 methylase also modifies the XhoI site. The XhoI recognition sequence 5'CTCGAG3' is similar to the BpmI recognition sequence 5'CTGGAG3', with only one base difference. It is concluded that BpmI M1 methylase may recognize the sequence 5'CTNNAG3' and modify the adenine base to generate N6-methyladenine in the symmetric recognition sequence.

6. Expression of bpmIRM gene in E. coli using a T7 expression vector.

[0047] Two primers were synthesized to amplify the bpmIRM gene. The primer sequences were:



5' caaggatccggaggtaaataaatgcatataagtgagttagtagataaatac 3' 
(247-217) (SEQ ID NO: 15) 



5' ttaggatcctcatttttcttctcctaacgccgctgt 3' (238-182)
(SEQ ID NO: 16)

[0048] The 3,030-bp bpmIRM gene was amplified in PCR using Taq DNA polymerase, digested with BamHI and ligated into BamHI-digested T7 expression vectors pAII17 and pET21a. After transformation of the ligated DNA into ER2566 [pACYC-bpmIM1], Ap^R Cm^R transformants were screened for the endonuclease gene insert. Seven out of 72 clones contained the insert with correct orientation. However, no BpmI activity was detected in cell extracts of IPTG-induced cells. This was probably due to mutations introduced during the PCR process.

[0049] To reduce the mutation frequency, Deep Vent® DNA polymerase was used in PCR reactions to amplify the 3,030-bp bpmIRM gene. The forward primer incorporated an XbaI site and its sequence is the following:

5' caccaatctagaggaggtaaataaatgcatataagtgagttagtagataaatac 3' (238-181) (SEQ ID NO: 17)


[0050] PCR was performed using primers 238-181, 238-182, and Deep Vent® DNA polymerase. The PCR conditions were 94°C 5 min for one cycle; 94°C 1 min, 55°C 1.5 min, 72°C 8 min for 20 cycles. The PCR product was purified through a Qiagen spin column and digested with BamHI and XbaI and ligated to T7 expression vectors pAII17 and pET21at with compatible ends. Eighteen out of 36 clones contain the correct size insert. Ten ml cell cultures for all 18 clones containing inserts were induced with IPTG for 3 h and cell extracts were prepared by sonication and assayed for BpmI activity. Clone #4 displayed partial BpmI activity. Because this gene was derived by PCR cloning, the entire bpmIRM fusion gene was sequenced on both strands and it was confirmed to be the wild type sequence.
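Why polymerase fidelity matters for a 3,030-bp gene can be illustrated with a simple Poisson estimate in Python. The two error rates below are placeholder values chosen purely for illustration, not measured values for either polymerase, and the number of template doublings is assumed equal to the 20 cycles stated above.

```python
import math

def error_free_fraction(length_bp, error_rate, doublings):
    """Poisson estimate of the fraction of products carrying no mutation.

    error_rate is in errors per base per duplication (hypothetical input).
    """
    return math.exp(-error_rate * length_bp * doublings)

GENE_LENGTH = 3030  # bp, from the text
DOUBLINGS = 20      # assumption: one doubling per cycle

low_fidelity = error_free_fraction(GENE_LENGTH, 1e-4, DOUBLINGS)   # placeholder rate
high_fidelity = error_free_fraction(GENE_LENGTH, 1e-5, DOUBLINGS)  # placeholder rate
print(f"{low_fidelity:.4f} {high_fidelity:.4f}")
```

Under these assumed rates a tenfold gain in fidelity changes the error-free fraction from well under 1% to roughly half, which is in keeping with an active clone being recovered only after the switch to a proofreading polymerase and with the need to sequence that clone.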

7. Partial purification of recombinant BpmI activity.

[0051] Five hundred ml of cell culture was made for the expression clone #4 ER2566 [pACYC-bpmIM1, pET21at-bpmIRM]. The late log cells were induced with IPTG and cell extract (40 ml) containing BpmI was purified through a heparin Sepharose column. Proteins were eluted with a NaCl gradient of 50 mM to 1 M. Fractions 6 to 27 contained the highest protein concentration and were assayed for BpmI activity on λ DNA. It was found that fractions 15 to 18 contained the highest BpmI activity (Figure 4). The yield was estimated at 1,800 units of BpmI per gram of wet E. coli cells. The specific activity was estimated at 24,000 units per mg of protein. Proteins from fractions 15 to 18 were analyzed on an SDS-PAGE gel and protein bands were stained with Gelcode blue stain. A protein band corresponding to ~115 kDa was detected on the protein gel, in close agreement with the predicted size of 117 kDa.
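The figures above can be related to one another with a few lines of Python. The sketch assumes that the yield and the specific activity refer to the same pooled fractions, which the text does not state explicitly; the variable names are illustrative.

```python
yield_units_per_g = 1800      # units of BpmI per gram of wet cells (from the text)
specific_activity = 24000     # units per mg of protein (from the text)
predicted_mass_kda = 116.891  # predicted from the 1,009-aa sequence
observed_mass_kda = 115.0     # apparent size on SDS-PAGE

# Protein recovered in the active pool per gram of wet cells.
protein_mg_per_g = yield_units_per_g / specific_activity

# Relative gap between the apparent and the predicted molecular mass.
mass_gap_percent = 100 * (predicted_mass_kda - observed_mass_kda) / predicted_mass_kda

print(protein_mg_per_g, round(mass_gap_percent, 1))
```

On that assumption the active pool corresponds to 0.075 mg of protein per gram of wet cells, and the apparent and predicted masses differ by less than 2%.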
[0052] The E. coli strain ER2566 [pACYC-bpmIM1, pET21at-bpmIRM] has been deposited under the terms and conditions of the Budapest Treaty with the American Type Culture Collection on October 12, 2000 and received Accession No. PTA-2598.


Example 2 

Deletion of the methylase portion of BpmI RM fusion protein

[0053] Two primers were synthesized to amplify the putative endonuclease domain with deletion of the methylase and specificity domains. The deletion clone thus contains only the R portion; the M and S portions were removed. The forward primer was 238-181 as described above. The reverse primer had the following sequence with an XhoI site at the 5' end:


5' tgaaatctcgagttatcctgatccacaacatatatctgctat 3' (244-95) 
(SEQ ID NO:18) 


[0054] The deletion junction was in motif I of the γ type N6 adenine methylase. The γ type N6 adenine methylases contain conserved motifs X, I, II, III, IV, V, VI, VII, VIII. The specificity domain (TRD) is located after motif VIII. The BpmI deletion clone (BpmI-Δ#1) still carried motif X and part of motif I. The specificity domain after motif VIII was also deleted (the remaining portion is shown in Figure 1).

[0055] PCR was performed using primers 238-181 and 244-95 and Taq plus Vent® DNA polymerase (94°C 1 min, 60°C 1 min, and 72°C 1 min for 25 cycles). The PCR product was digested with XbaI and XhoI and cloned into a T7 expression vector pET21b. Sixteen clones out of 36 screened contained the correct size insert and the cells were induced with IPTG for 3 h. Cell extract was prepared by sonication and assayed for BpmI activity on λ DNA. However, no apparent BpmI digestion pattern was detected. Only non-specific nuclease was detected in cell extract, resulting in a smearing of the DNA substrate. It was concluded that deletion of the methylase and specificity portion of the BpmI RM fusion protein abolished BpmI restriction activity.

[0056] To further confirm the above result, another deletion clone was constructed that deleted methylase motifs IV, V, VI, VII, VIII, and the specificity domain. This EcoRI fragment deletion mutant contains a 1,521 bp (507 amino acid) deletion at the C-terminal half of the fusion protein (BpmI-Δ#2). IPTG-induced cell extract of this mutant also did not display BpmI endonuclease activity.

[0057] To delete the specificity domain (target-recognizing domain, TRD), a HindIII fragment of 579 bp (193 amino
acids) was deleted from the C-terminus of the BpmI RM fusion endonuclease (BpmI-Δ#3). IPTG-induced cell extract of the
TRD deletion mutant did not show any BpmI endonuclease activity. However, the mutant protein displayed non-specific
nuclease activity. It was concluded that the specificity (TRD) domain is also required for BpmI endonuclease activity.
Deletion of the specificity (TRD) domain may abolish or reduce its DNA binding affinity and specificity. By swapping in
other N6 methylase and specificity domains, one may be able to create new enzyme specificity.
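
The fragment sizes quoted for the two C-terminal deletions can be checked with simple arithmetic: an in-frame deletion must be a multiple of three base pairs, and each codon corresponds to one amino acid. The helper `bp_to_codons` below is a hypothetical name used only for this sketch.

```python
def bp_to_codons(bp):
    """Convert an in-frame deletion length in base pairs to whole codons."""
    codons, remainder = divmod(bp, 3)
    if remainder:
        raise ValueError(f"{bp} bp is not a whole number of codons")
    return codons


# Deletion sizes quoted in the text.
print(bp_to_codons(1521))  # 507 (EcoRI fragment deletion)
print(bp_to_codons(579))   # 193 (HindIII fragment deletion)

# The full-length fusion gene in the sequence listing is 3030 bp:
# 1010 codons, i.e. 1009 amino acids plus the stop codon.
print(bp_to_codons(3030) - 1)  # 1009
```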

Example 3

Generation of new enzyme specificity using BpmI RM fusion protein

[0058] Since BpmI endonuclease consists of three domains (R-M-S), it is possible to plug in other methylation-specificity
domains to create a new enzyme specificity. The BpmIRM fusion gene is cloned in a T7 expression vector as
described in Example 1. Plasmid DNA is prepared. The γ type N6 adenine methylases contain conserved motifs X,
I, II, III, IV, V, VI, VII, and VIII (Malone T. et al. J. Mol. Biol. 253:618-632 (1995)). Motifs X through VIII and the TRD are deleted
and a DNA linker coding for one or more bridging amino acids is inserted with a restriction site, preferably blunt (for
example a SmaI site). The number of amino acids will differ from one system to the next and can be determined by
routine experimentation. The goal is to provide sufficient steric space for the introduction of the new M-S domains.
DNA coding for other γ type N6 adenine methylases containing motifs X, I, II, III, IV, V, VI, VII, VIII and the TRD is
ligated to the digested blunt site (in frame) of the BpmI deletion clone. The ligated DNA is transformed into a non-T7
expression vector. After the insert is verified, the plasmid containing the new methylation-specificity domains is transformed
into a T7 expression host and induced with IPTG. Cell extract is assayed on plasmid and phage DNA and analyzed
for new restriction activity.
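
The in-frame requirement for the linker can be illustrated with a small check. This is a sketch under stated assumptions, not the procedure used in the patent: the linker sequence is invented for illustration, the function name `check_linker` is hypothetical, and the only facts relied on are that SmaI recognizes CCCGGG and cuts blunt between CCC and GGG.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
SMAI_SITE = "CCCGGG"  # SmaI cuts blunt between CCC and GGG


def check_linker(linker):
    """Return True if the linker keeps the reading frame, contains no stop
    codon, and offers a SmaI blunt cut that falls on a codon boundary."""
    seq = linker.upper()
    if len(seq) % 3 != 0:
        return False  # would shift the downstream reading frame
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    if STOP_CODONS.intersection(codons):
        return False  # would terminate translation inside the linker
    site = seq.find(SMAI_SITE)
    if site == -1:
        return False  # no blunt cloning site present
    # The blunt cut lies 3 nt into the site; require a codon boundary there.
    return (site + 3) % 3 == 0


# Hypothetical linker, invented for illustration: Gly-Gly-Pro-Gly-Gly.
HYPOTHETICAL_LINKER = "GGTGGACCCGGGGGT"
print(check_linker(HYPOTHETICAL_LINKER))  # True
```

A blunt insert whose length is a multiple of three and which begins on a codon boundary would then stay in frame with the upstream R domain. How many bridging residues are needed remains an empirical question, as the text notes.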






SEQUENCE LISTING 

<110> New England Biolabs, Inc.

<120> Method for Cloning and Expression of BpmI Restriction Endonuclease in E. coli

<130> 43635/EP

<160> 18

<170> PatentIn version 3.1

<210> 1
<211> 1650
<212> DNA
<213> Bacillus pumilus
<220>
<221> CDS
<222> (1)..(1650)
<223>






<400> 1 

atg aat caa tta att gaa aat gtt aat cta caa aaa tta agg ggt ggg 48
Met Asn Gln Leu Ile Glu Asn Val Asn Leu Gln Lys Leu Arg Gly Gly
1 5 10 15

tat tac acc cct aaa gtt att gct gac ttt tta tgt caa tgg agt att 96
Tyr Tyr Thr Pro Lys Val Ile Ala Asp Phe Leu Cys Gln Trp Ser Ile
20 25 30

caa gat gac aca aag agt gta ctt gaa ccc agt tgt gga gat ggt aat 144
Gln Asp Asp Thr Lys Ser Val Leu Glu Pro Ser Cys Gly Asp Gly Asn
35 40 45

ttt att gaa tcg gca ata ctt agg ttc aaa gaa ctt agt ata gat aat 192
Phe Ile Glu Ser Ala Ile Leu Arg Phe Lys Glu Leu Ser Ile Asp Asn
50 55 60

gaa caa ctt aaa gga aga att aca gga gta gag cta att gaa gaa gaa 240
Glu Gln Leu Lys Gly Arg Ile Thr Gly Val Glu Leu Ile Glu Glu Glu
65 70 75 80

gct ttg aaa gtt caa aat cga gca aat gag ttg ggg gtt gat aaa aac 288
Ala Leu Lys Val Gln Asn Arg Ala Asn Glu Leu Gly Val Asp Lys Asn
85 90 95

tca ata gta aat agt gac ttc ttt caa ttt gta aaa gat aat aag aat 336
Ser Ile Val Asn Ser Asp Phe Phe Gln Phe Val Lys Asp Asn Lys Asn
100 105 110

aaa aaa ttt gat act att att ggt aat cca cca ttc ata aga tac caa 384
Lys Lys Phe Asp Thr Ile Ile Gly Asn Pro Pro Phe Ile Arg Tyr Gln
115 120 125

aac ttt cct gaa gag cat cgt agt ata gcc atg gaa atg atg gag gaa 432
Asn Phe Pro Glu Glu His Arg Ser Ile Ala Met Glu Met Met Glu Glu
130 135 140

cta ggt tta aaa cct aat aaa ctt aca aat atc tgg gtt cca ttt cta 480
Leu Gly Leu Lys Pro Asn Lys Leu Thr Asn Ile Trp Val Pro Phe Leu
145 150 155 160

gtg gta tct gct aca tta ctt aat gaa caa gga aag atg gct atg gtt 528
Val Val Ser Ala Thr Leu Leu Asn Glu Gln Gly Lys Met Ala Met Val
165 170 175

ata ccg gct gaa tta ttt cag gta aag tat gca gca gaa aca aga att 576
Ile Pro Ala Glu Leu Phe Gln Val Lys Tyr Ala Ala Glu Thr Arg Ile
180 185 190

ttt tta tca aag ttt ttc gat cgt atc act ata att aca ttt gaa aaa 624
Phe Leu Ser Lys Phe Phe Asp Arg Ile Thr Ile Ile Thr Phe Glu Lys
195 200 205

ctt gtt ttt gaa aat atc caa cag gaa gtt ata cta ctt ctt tgt gaa 672
Leu Val Phe Glu Asn Ile Gln Gln Glu Val Ile Leu Leu Leu Cys Glu
210 215 220

aag aaa gtt aat aaa ggt aaa gga att cgg gtt att gaa tgc gag aac 720
Lys Lys Val Asn Lys Gly Lys Gly Ile Arg Val Ile Glu Cys Glu Asn
225 230 235 240

tta gat gga tta aat tcc att gat ttt gta gct ata aat ggt tca aat 768
Leu Asp Gly Leu Asn Ser Ile Asp Phe Val Ala Ile Asn Gly Ser Asn
245 250 255

gtt aaa cct att gaa cac cgt act gaa aag tgg aca aag tat ttc tta 816
Val Lys Pro Ile Glu His Arg Thr Glu Lys Trp Thr Lys Tyr Phe Leu
260 265 270

aac gaa gat gaa ata ctt ctt tta cag agt tta aag gaa gac aaa cgc 864
Asn Glu Asp Glu Ile Leu Leu Leu Gln Ser Leu Lys Glu Asp Lys Arg
275 280 285

gtt aaa aat tgt aat gac tat ttt aag aca gaa gtt ggc tta gtt act 912
Val Lys Asn Cys Asn Asp Tyr Phe Lys Thr Glu Val Gly Leu Val Thr
290 295 300

gga cga aac gaa ttc ttt atg atg aaa gaa aac caa gta aaa gaa tgg 960
Gly Arg Asn Glu Phe Phe Met Met Lys Glu Asn Gln Val Lys Glu Trp
305 310 315 320

aat cta gaa gaa tat aca ata cct gtt aca ggt agg tcc aat cag tta 1008
Asn Leu Glu Glu Tyr Thr Ile Pro Val Thr Gly Arg Ser Asn Gln Leu
325 330 335

aaa ggt ata aca ttt aca gaa aat gat ttt cat gaa aat tca atg gaa 1056
Lys Gly Ile Thr Phe Thr Glu Asn Asp Phe His Glu Asn Ser Met Glu
340 345 350

caa aag gca att cac cta ttt ttg cca cca gat gaa gat ttt gaa aag 1104
Gln Lys Ala Ile His Leu Phe Leu Pro Pro Asp Glu Asp Phe Glu Lys
355 360 365

tta ccg att gag tgt caa aat tat atc aag tat ggg gaa gaa aaa ggc 1152
Leu Pro Ile Glu Cys Gln Asn Tyr Ile Lys Tyr Gly Glu Glu Lys Gly
370 375 380

ttc cat caa ggc tat aaa acc aga att aga aaa cgt tgg tat ata act 1200
Phe His Gln Gly Tyr Lys Thr Arg Ile Arg Lys Arg Trp Tyr Ile Thr
385 390 395 400

cca tct aga tgg gtt cca gat gct ttt gct tta aga cag gtt gat ggc 1248
Pro Ser Arg Trp Val Pro Asp Ala Phe Ala Leu Arg Gln Val Asp Gly
405 410 415

tat cca aaa cta att tta aat gaa acc gac gct tct tct act gat aca 1296
Tyr Pro Lys Leu Ile Leu Asn Glu Thr Asp Ala Ser Ser Thr Asp Thr
420 425 430

att cat agg gtt aga ttt aaa gaa ggt ata aat gaa aag tta gcc gta 1344
Ile His Arg Val Arg Phe Lys Glu Gly Ile Asn Glu Lys Leu Ala Val
435 440 445

gtt tca ttt ttg aac tca ctc act ttt gca tct tca gaa ata acg ggg 1392
Val Ser Phe Leu Asn Ser Leu Thr Phe Ala Ser Ser Glu Ile Thr Gly
450 455 460

aga agt tat ggt ggt ggt gtt atg aca ttc gaa cca act gaa att gga 1440
Arg Ser Tyr Gly Gly Gly Val Met Thr Phe Glu Pro Thr Glu Ile Gly
465 470 475 480

gaa atc cta ata cct tcc ttt gat aac tta tcc att gat ttt gat aaa 1488
Glu Ile Leu Ile Pro Ser Phe Asp Asn Leu Ser Ile Asp Phe Asp Lys
485 490 495

att gat gcc tta att cga gaa aag gag att gaa aaa gtc ctt gat att 1536
Ile Asp Ala Leu Ile Arg Glu Lys Glu Ile Glu Lys Val Leu Asp Ile
500 505 510

gtt gat gaa gct tta ctt ata aaa tat cat ggg ttt agt gag aaa gaa 1584
Val Asp Glu Ala Leu Leu Ile Lys Tyr His Gly Phe Ser Glu Lys Glu
515 520 525

gta aaa cag ctt cga ggg ata tgg aag aaa ctt tct cag aga aga aac 1632
Val Lys Gln Leu Arg Gly Ile Trp Lys Lys Leu Ser Gln Arg Arg Asn
530 535 540

aat aga acg aag aaa taa 1650
Asn Arg Thr Lys Lys
545

<210> 2
<211> 549
<212> PRT
<213> Bacillus pumilus
<400> 2

Met Asn Gln Leu Ile Glu Asn Val Asn Leu Gln Lys Leu Arg Gly Gly
1 5 10 15

Tyr Tyr Thr Pro Lys Val Ile Ala Asp Phe Leu Cys Gln Trp Ser Ile
20 25 30

Gln Asp Asp Thr Lys Ser Val Leu Glu Pro Ser Cys Gly Asp Gly Asn
35 40 45

Phe Ile Glu Ser Ala Ile Leu Arg Phe Lys Glu Leu Ser Ile Asp Asn
50 55 60

Glu Gln Leu Lys Gly Arg Ile Thr Gly Val Glu Leu Ile Glu Glu Glu
65 70 75 80

Ala Leu Lys Val Gln Asn Arg Ala Asn Glu Leu Gly Val Asp Lys Asn
85 90 95

Ser Ile Val Asn Ser Asp Phe Phe Gln Phe Val Lys Asp Asn Lys Asn
100 105 110

Lys Lys Phe Asp Thr Ile Ile Gly Asn Pro Pro Phe Ile Arg Tyr Gln
115 120 125

Asn Phe Pro Glu Glu His Arg Ser Ile Ala Met Glu Met Met Glu Glu
130 135 140

Leu Gly Leu Lys Pro Asn Lys Leu Thr Asn Ile Trp Val Pro Phe Leu
145 150 155 160

Val Val Ser Ala Thr Leu Leu Asn Glu Gln Gly Lys Met Ala Met Val
165 170 175

Ile Pro Ala Glu Leu Phe Gln Val Lys Tyr Ala Ala Glu Thr Arg Ile
180 185 190

Phe Leu Ser Lys Phe Phe Asp Arg Ile Thr Ile Ile Thr Phe Glu Lys
195 200 205

Leu Val Phe Glu Asn Ile Gln Gln Glu Val Ile Leu Leu Leu Cys Glu
210 215 220

Lys Lys Val Asn Lys Gly Lys Gly Ile Arg Val Ile Glu Cys Glu Asn
225 230 235 240

Leu Asp Gly Leu Asn Ser Ile Asp Phe Val Ala Ile Asn Gly Ser Asn
245 250 255

Val Lys Pro Ile Glu His Arg Thr Glu Lys Trp Thr Lys Tyr Phe Leu
260 265 270

Asn Glu Asp Glu Ile Leu Leu Leu Gln Ser Leu Lys Glu Asp Lys Arg
275 280 285

Val Lys Asn Cys Asn Asp Tyr Phe Lys Thr Glu Val Gly Leu Val Thr
290 295 300

Gly Arg Asn Glu Phe Phe Met Met Lys Glu Asn Gln Val Lys Glu Trp
305 310 315 320

Asn Leu Glu Glu Tyr Thr Ile Pro Val Thr Gly Arg Ser Asn Gln Leu
325 330 335

Lys Gly Ile Thr Phe Thr Glu Asn Asp Phe His Glu Asn Ser Met Glu
340 345 350

Gln Lys Ala Ile His Leu Phe Leu Pro Pro Asp Glu Asp Phe Glu Lys
355 360 365

Leu Pro Ile Glu Cys Gln Asn Tyr Ile Lys Tyr Gly Glu Glu Lys Gly
370 375 380

Phe His Gln Gly Tyr Lys Thr Arg Ile Arg Lys Arg Trp Tyr Ile Thr
385 390 395 400

Pro Ser Arg Trp Val Pro Asp Ala Phe Ala Leu Arg Gln Val Asp Gly
405 410 415

Tyr Pro Lys Leu Ile Leu Asn Glu Thr Asp Ala Ser Ser Thr Asp Thr
420 425 430

Ile His Arg Val Arg Phe Lys Glu Gly Ile Asn Glu Lys Leu Ala Val
435 440 445

Val Ser Phe Leu Asn Ser Leu Thr Phe Ala Ser Ser Glu Ile Thr Gly
450 455 460

Arg Ser Tyr Gly Gly Gly Val Met Thr Phe Glu Pro Thr Glu Ile Gly
465 470 475 480

Glu Ile Leu Ile Pro Ser Phe Asp Asn Leu Ser Ile Asp Phe Asp Lys
485 490 495

Ile Asp Ala Leu Ile Arg Glu Lys Glu Ile Glu Lys Val Leu Asp Ile
500 505 510

Val Asp Glu Ala Leu Leu Ile Lys Tyr His Gly Phe Ser Glu Lys Glu
515 520 525

Val Lys Gln Leu Arg Gly Ile Trp Lys Lys Leu Ser Gln Arg Arg Asn
530 535 540

Asn Arg Thr Lys Lys
545


<210> 3
<211> 3030
<212> DNA
<213> Bacillus pumilus
<220>
<221> CDS
<222> (1)..(3030)
<223>

<400> 3

atg cat ata agt gag tta gta gat aaa tac aaa gcg cat aga agt act 48
Met His Ile Ser Glu Leu Val Asp Lys Tyr Lys Ala His Arg Ser Thr
1 5 10 15

ttt tta aaa cca act tat aat gaa act caa cta agg aat gat ttt ata 96
Phe Leu Lys Pro Thr Tyr Asn Glu Thr Gln Leu Arg Asn Asp Phe Ile
20 25 30

gac cca ctt cta aaa tct tta gga tgg gat gtt gat aat acc aaa gga 144
Asp Pro Leu Leu Lys Ser Leu Gly Trp Asp Val Asp Asn Thr Lys Gly
35 40 45

aaa aca cat att cta aga gat gtc att caa gaa gaa tac ata gaa ata 192
Lys Thr His Ile Leu Arg Asp Val Ile Gln Glu Glu Tyr Ile Glu Ile
50 55 60

aaa gat gag gag aca aag aaa aat cca gat tat aca ctt cgt ata aac 240
Lys Asp Glu Glu Thr Lys Lys Asn Pro Asp Tyr Thr Leu Arg Ile Asn
65 70 75 80

ggt acg aga aag ctg ttt gta gag gtt aag aaa ccg tct ttt aat att 288
Gly Thr Arg Lys Leu Phe Val Glu Val Lys Lys Pro Ser Phe Asn Ile
85 90 95

ttg aaa tca gct aaa gca gcc ttc caa aca aga aga tat ggt tgg agt 336
Leu Lys Ser Ala Lys Ala Ala Phe Gln Thr Arg Arg Tyr Gly Trp Ser
100 105 110

gct aac ctt ggt att tca gta ctt aca aat ttc gag cat cta gtt att 384
Ala Asn Leu Gly Ile Ser Val Leu Thr Asn Phe Glu His Leu Val Ile
115 120 125

tat gat tgt aga tat acg cct gac aaa tcc gac aat gaa cat att gct 432
Tyr Asp Cys Arg Tyr Thr Pro Asp Lys Ser Asp Asn Glu His Ile Ala
130 135 140

aga tat aaa gtt ttc tct tac gag gaa tat gaa gaa gca ttt gat gaa 480
Arg Tyr Lys Val Phe Ser Tyr Glu Glu Tyr Glu Glu Ala Phe Asp Glu
145 150 155 160

ata aag gat ata att tca tat gag tca gcc aac tca ggt gct ctg gac 528
Ile Lys Asp Ile Ile Ser Tyr Glu Ser Ala Asn Ser Gly Ala Leu Asp
165 170 175

gaa atg ttt gat gta aat aca aga gtt ggt gaa acg ttt gac gag tat 576
Glu Met Phe Asp Val Asn Thr Arg Val Gly Glu Thr Phe Asp Glu Tyr
180 185 190

ttt tta cag caa att gag aat tgg cgc gaa aag cta gct aaa act gca 624
Phe Leu Gln Gln Ile Glu Asn Trp Arg Glu Lys Leu Ala Lys Thr Ala
195 200 205

att aaa aat aac acc gaa tta ggt gaa gag gac gtc aat ttt att gtc 672
Ile Lys Asn Asn Thr Glu Leu Gly Glu Glu Asp Val Asn Phe Ile Val
210 215 220

caa aga cta tta aac aga att att ttt ctt aga gtt tgt gaa gat aga 720
Gln Arg Leu Leu Asn Arg Ile Ile Phe Leu Arg Val Cys Glu Asp Arg
225 230 235 240

acc att gaa aaa tat gaa aca att aaa agt ata aaa aac tat gag gaa 768
Thr Ile Glu Lys Tyr Glu Thr Ile Lys Ser Ile Lys Asn Tyr Glu Glu
245 250 255

tta aaa gat ctg ttt caa aag tct gat agg aaa ttt aat tca ggt ctc 816
Leu Lys Asp Leu Phe Gln Lys Ser Asp Arg Lys Phe Asn Ser Gly Leu
260 265 270

ttt gac ttc ata gat gat acg ctc ttg ctt gag gtt gaa att gat tcg 864
Phe Asp Phe Ile Asp Asp Thr Leu Leu Leu Glu Val Glu Ile Asp Ser
275 280 285

aat gta ttg ata gaa att ttt agt gat tta tat ttc cca caa agc cca 912
Asn Val Leu Ile Glu Ile Phe Ser Asp Leu Tyr Phe Pro Gln Ser Pro
290 295 300

tat gat ttt tct gtt gtc gat cca aca ata tta agc cag ata tat gaa 960
Tyr Asp Phe Ser Val Val Asp Pro Thr Ile Leu Ser Gln Ile Tyr Glu
305 310 315 320

cgt ttt cta ggt caa gaa ata att ata gag tca ggt ggt aca ttt cac 1008
Arg Phe Leu Gly Gln Glu Ile Ile Ile Glu Ser Gly Gly Thr Phe His
325 330 335

att acg gag tca cca gaa gtt gcg gcg tcc aat ggt gtt gtt cca act 1056
Ile Thr Glu Ser Pro Glu Val Ala Ala Ser Asn Gly Val Val Pro Thr
340 345 350

cca aaa att atc gtc gaa cag ata gtg aaa gac act tta acg ccc ctt 1104
Pro Lys Ile Ile Val Glu Gln Ile Val Lys Asp Thr Leu Thr Pro Leu
355 360 365

acg gaa ggc aaa aaa ttt aat gag cta tgt aac tta aaa ata gca gat 1152
Thr Glu Gly Lys Lys Phe Asn Glu Leu Cys Asn Leu Lys Ile Ala Asp
370 375 380

ata tgt tgt gga tca gga act ttc cta att tca agt tat gac ttt cta 1200
Ile Cys Cys Gly Ser Gly Thr Phe Leu Ile Ser Ser Tyr Asp Phe Leu
385 390 395 400

gta gag aaa gta atg gaa aag ata ata gaa gag aac atc gat gat tca 1248
Val Glu Lys Val Met Glu Lys Ile Ile Glu Glu Asn Ile Asp Asp Ser
405 410 415

gat tta gta tat gaa act gaa gaa ggg cta att ttg aca ctt aaa gca 1296
Asp Leu Val Tyr Glu Thr Glu Glu Gly Leu Ile Leu Thr Leu Lys Ala
420 425 430

aaa aga aat atc ttg gag aat aat ttg ttt ggt gtt gat gtt aat cca 1344
Lys Arg Asn Ile Leu Glu Asn Asn Leu Phe Gly Val Asp Val Asn Pro
435 440 445

tac gct gtt gaa gta gct gag ttc agt tta tta tta aag cta tta gaa 1392
Tyr Ala Val Glu Val Ala Glu Phe Ser Leu Leu Leu Lys Leu Leu Glu
450 455 460

ggt gag aat gag gca tcg gtt aat aat ttc att cac gag cat gag gat 1440
Gly Glu Asn Glu Ala Ser Val Asn Asn Phe Ile His Glu His Glu Asp
465 470 475 480

aaa ata tta ccg gat tta aca tct att att aaa tgt gga aac agc tta 1488
Lys Ile Leu Pro Asp Leu Thr Ser Ile Ile Lys Cys Gly Asn Ser Leu
485 490 495

gta gat aat aag ttt ttt gaa ttc atg cca gaa tcg tta gag gac gat 1536
Val Asp Asn Lys Phe Phe Glu Phe Met Pro Glu Ser Leu Glu Asp Asp
500 505 510

gaa atc tta ttt aag gct aat cca ttt gaa tgg gaa gag gag ttt cca 1584
Glu Ile Leu Phe Lys Ala Asn Pro Phe Glu Trp Glu Glu Glu Phe Pro
515 520 525

gat att atg gca aat ggt ggc ttt gat gct att ata gga aat cca cct 1632
Asp Ile Met Ala Asn Gly Gly Phe Asp Ala Ile Ile Gly Asn Pro Pro
530 535 540

tat gtt cga ata cag aac atg aaa aaa tat agt cct gag gaa att gaa 1680
Tyr Val Arg Ile Gln Asn Met Lys Lys Tyr Ser Pro Glu Glu Ile Glu
545 550 555 560

tat tat caa tca aaa gac tct gaa tat act gtt gca aaa aaa gaa aca 1728
Tyr Tyr Gln Ser Lys Asp Ser Glu Tyr Thr Val Ala Lys Lys Glu Thr
565 570 575

gtt gac aag tat ttt tta ttt att gag aga gca tta ata tta ctc aat 1776
Val Asp Lys Tyr Phe Leu Phe Ile Glu Arg Ala Leu Ile Leu Leu Asn
580 585 590

cct act ggg ctg ttg ggt tat ata ata ccg cat aaa ttc ttt att aca 1824
Pro Thr Gly Leu Leu Gly Tyr Ile Ile Pro His Lys Phe Phe Ile Thr
595 600 605

aaa ggt ggt aag gaa cta aga aag ttc ata gct gaa aaa cat caa ata 1872
Lys Gly Gly Lys Glu Leu Arg Lys Phe Ile Ala Glu Lys His Gln Ile
610 615 620

tca aaa att ata aat ttt ggt gtt aca cag gtc ttt cca gga aga gcg 1920
Ser Lys Ile Ile Asn Phe Gly Val Thr Gln Val Phe Pro Gly Arg Ala
625 630 635 640

aca tat acg gct att tta att atc caa gca aat aaa atg gca cag ttc 1968
Thr Tyr Thr Ala Ile Leu Ile Ile Gln Ala Asn Lys Met Ala Gln Phe
645 650 655

aag tat aag aaa gta agt aat ata tca gca gaa acc cta gat tct gaa 2016
Lys Tyr Lys Lys Val Ser Asn Ile Ser Ala Glu Thr Leu Asp Ser Glu
660 665 670

gaa aat acg tgt gtt tat agc tca gaa aag tat aat tct gac cct tgg 2064
Glu Asn Thr Cys Val Tyr Ser Ser Glu Lys Tyr Asn Ser Asp Pro Trp
675 680 685

ata ttt tta tct cct gaa aca gaa gct gtt ttt act aaa ttt aca gaa 2112
Ile Phe Leu Ser Pro Glu Thr Glu Ala Val Phe Thr Lys Phe Thr Glu
690 695 700

gct caa ttt gag aaa ctt gga gaa atc act gat ata agt gta gga cta 2160
Ala Gln Phe Glu Lys Leu Gly Glu Ile Thr Asp Ile Ser Val Gly Leu
705 710 715 720

caa aca agc gct gat aaa ata tat att ttt att cct gaa aat gaa act 2208
Gln Thr Ser Ala Asp Lys Ile Tyr Ile Phe Ile Pro Glu Asn Glu Thr
725 730 735

tca gat aca tat ata ttt aat tat aaa ggg aaa aga tat gaa ata gaa 2256
Ser Asp Thr Tyr Ile Phe Asn Tyr Lys Gly Lys Arg Tyr Glu Ile Glu
740 745 750

aaa tct ata tgt tgc cca gct atc tat gac tta tct ttt ggt tct ttt 2304
Lys Ser Ile Cys Cys Pro Ala Ile Tyr Asp Leu Ser Phe Gly Ser Phe
755 760 765

gaa agc att cag gga aat gca caa atg ata ttc cct tat gaa atc aga 2352
Glu Ser Ile Gln Gly Asn Ala Gln Met Ile Phe Pro Tyr Glu Ile Arg
770 775 780

gat gaa gaa gca tat cta cta gag gaa gaa acg ctt gaa aat gat tat 2400
Asp Glu Glu Ala Tyr Leu Leu Glu Glu Glu Thr Leu Glu Asn Asp Tyr
785 790 795 800

cct ctt gct tgg aat tat ttg aat gag ttt aaa gaa gct ctt gaa aaa 2448
Pro Leu Ala Trp Asn Tyr Leu Asn Glu Phe Lys Glu Ala Leu Glu Lys
805 810 815

aga agc tta caa ggc cgt aat ccg aaa tgg tat caa tat ggt cgg tcc 2496
Arg Ser Leu Gln Gly Arg Asn Pro Lys Trp Tyr Gln Tyr Gly Arg Ser
820 825 830

caa agt tta tca aaa ttt cat gat aaa gaa aaa ctg ata tgg acc gta 2544
Gln Ser Leu Ser Lys Phe His Asp Lys Glu Lys Leu Ile Trp Thr Val
835 840 845

ctt gct acg aaa ccc ccg tat gta ctt gat agg aat aac ctg tta ttt 2592
Leu Ala Thr Lys Pro Pro Tyr Val Leu Asp Arg Asn Asn Leu Leu Phe
850 855 860

act ggt ggt gga aac gga ccg tat tat ggt tta att aac caa tct att 2640
Thr Gly Gly Gly Asn Gly Pro Tyr Tyr Gly Leu Ile Asn Gln Ser Ile
865 870 875 880

tac tct ttg cat tat ttt tta ggt att ctt tca cat cct gta ata gaa 2688
Tyr Ser Leu His Tyr Phe Leu Gly Ile Leu Ser His Pro Val Ile Glu
885 890 895

agt atg gta aaa gca agg gcc agt gaa ttt agg gga tca tat tat tct 2736
Ser Met Val Lys Ala Arg Ala Ser Glu Phe Arg Gly Ser Tyr Tyr Ser
900 905 910

cat gga aaa caa ttt att gag aaa atc cca att aga aag att gat ttt 2784
His Gly Lys Gln Phe Ile Glu Lys Ile Pro Ile Arg Lys Ile Asp Phe
915 920 925

gat gat caa gat gag gta gac aaa tat aat acg gtg gtc aca aca gta 2832
Asp Asp Gln Asp Glu Val Asp Lys Tyr Asn Thr Val Val Thr Thr Val
930 935 940

gaa aaa tta att ata act acc gat aga att aaa agt gag agc aat gga 2880
Glu Lys Leu Ile Ile Thr Thr Asp Arg Ile Lys Ser Glu Ser Asn Gly
945 950 955 960

ccc cgg agg aga atg tta aga aga agg tta gat gct ttg tct aat caa 2928
Pro Arg Arg Arg Met Leu Arg Arg Arg Leu Asp Ala Leu Ser Asn Gln
965 970 975

ctt atc cag gtt att aat gaa ctt tat aat atc agt gac gaa gaa tat 2976
Leu Ile Gln Val Ile Asn Glu Leu Tyr Asn Ile Ser Asp Glu Glu Tyr
980 985 990

acg aca gtt ttg aat gat gaa atg ttg aca gcg gcg tta gga gaa gaa 3024
Thr Thr Val Leu Asn Asp Glu Met Leu Thr Ala Ala Leu Gly Glu Glu
995 1000 1005

aaa tga 3030
Lys






<210> 4 
<211> 1009 
<212> PRT 

<213> Bacillus pumilus 
<400> 4 

Met His Ile Ser Glu Leu Val Asp Lys Tyr Lys Ala His Arg Ser Thr
1 5 10 15

Phe Leu Lys Pro Thr Tyr Asn Glu Thr Gln Leu Arg Asn Asp Phe Ile
20 25 30

Asp Pro Leu Leu Lys Ser Leu Gly Trp Asp Val Asp Asn Thr Lys Gly
35 40 45

Lys Thr His Ile Leu Arg Asp Val Ile Gln Glu Glu Tyr Ile Glu Ile
50 55 60

Lys Asp Glu Glu Thr Lys Lys Asn Pro Asp Tyr Thr Leu Arg Ile Asn
65 70 75 80

Gly Thr Arg Lys Leu Phe Val Glu Val Lys Lys Pro Ser Phe Asn Ile
85 90 95

Leu Lys Ser Ala Lys Ala Ala Phe Gln Thr Arg Arg Tyr Gly Trp Ser
100 105 110

Ala Asn Leu Gly Ile Ser Val Leu Thr Asn Phe Glu His Leu Val Ile
115 120 125

Tyr Asp Cys Arg Tyr Thr Pro Asp Lys Ser Asp Asn Glu His Ile Ala
130 135 140

Arg Tyr Lys Val Phe Ser Tyr Glu Glu Tyr Glu Glu Ala Phe Asp Glu
145 150 155 160

Ile Lys Asp Ile Ile Ser Tyr Glu Ser Ala Asn Ser Gly Ala Leu Asp
165 170 175

Glu Met Phe Asp Val Asn Thr Arg Val Gly Glu Thr Phe Asp Glu Tyr
180 185 190

Phe Leu Gln Gln Ile Glu Asn Trp Arg Glu Lys Leu Ala Lys Thr Ala
195 200 205

Ile Lys Asn Asn Thr Glu Leu Gly Glu Glu Asp Val Asn Phe Ile Val
210 215 220

Gln Arg Leu Leu Asn Arg Ile Ile Phe Leu Arg Val Cys Glu Asp Arg
225 230 235 240

Thr Ile Glu Lys Tyr Glu Thr Ile Lys Ser Ile Lys Asn Tyr Glu Glu
245 250 255

Leu Lys Asp Leu Phe Gln Lys Ser Asp Arg Lys Phe Asn Ser Gly Leu
260 265 270

Phe Asp Phe Ile Asp Asp Thr Leu Leu Leu Glu Val Glu Ile Asp Ser
275 280 285

Asn Val Leu Ile Glu Ile Phe Ser Asp Leu Tyr Phe Pro Gln Ser Pro
290 295 300

Tyr Asp Phe Ser Val Val Asp Pro Thr Ile Leu Ser Gln Ile Tyr Glu
305 310 315 320

Arg Phe Leu Gly Gln Glu Ile Ile Ile Glu Ser Gly Gly Thr Phe His
325 330 335

Ile Thr Glu Ser Pro Glu Val Ala Ala Ser Asn Gly Val Val Pro Thr
340 345 350

Pro Lys Ile Ile Val Glu Gln Ile Val Lys Asp Thr Leu Thr Pro Leu
355 360 365

Thr Glu Gly Lys Lys Phe Asn Glu Leu Cys Asn Leu Lys Ile Ala Asp
370 375 380

Ile Cys Cys Gly Ser Gly Thr Phe Leu Ile Ser Ser Tyr Asp Phe Leu
385 390 395 400

Val Glu Lys Val Met Glu Lys Ile Ile Glu Glu Asn Ile Asp Asp Ser
405 410 415

Asp Leu Val Tyr Glu Thr Glu Glu Gly Leu Ile Leu Thr Leu Lys Ala
420 425 430

Lys Arg Asn Ile Leu Glu Asn Asn Leu Phe Gly Val Asp Val Asn Pro
435 440 445

Tyr Ala Val Glu Val Ala Glu Phe Ser Leu Leu Leu Lys Leu Leu Glu
450 455 460

Gly Glu Asn Glu Ala Ser Val Asn Asn Phe Ile His Glu His Glu Asp
465 470 475 480

Lys Ile Leu Pro Asp Leu Thr Ser Ile Ile Lys Cys Gly Asn Ser Leu
485 490 495

Val Asp Asn Lys Phe Phe Glu Phe Met Pro Glu Ser Leu Glu Asp Asp
500 505 510

Glu Ile Leu Phe Lys Ala Asn Pro Phe Glu Trp Glu Glu Glu Phe Pro
515 520 525

Asp Ile Met Ala Asn Gly Gly Phe Asp Ala Ile Ile Gly Asn Pro Pro
530 535 540

Tyr Val Arg Ile Gln Asn Met Lys Lys Tyr Ser Pro Glu Glu Ile Glu
545 550 555 560

Tyr Tyr Gln Ser Lys Asp Ser Glu Tyr Thr Val Ala Lys Lys Glu Thr
565 570 575

Val Asp Lys Tyr Phe Leu Phe Ile Glu Arg Ala Leu Ile Leu Leu Asn
580 585 590

Pro Thr Gly Leu Leu Gly Tyr Ile Ile Pro His Lys Phe Phe Ile Thr
595 600 605

Lys Gly Gly Lys Glu Leu Arg Lys Phe Ile Ala Glu Lys His Gln Ile
610 615 620

Ser Lys Ile Ile Asn Phe Gly Val Thr Gln Val Phe Pro Gly Arg Ala
625 630 635 640

Thr Tyr Thr Ala Ile Leu Ile Ile Gln Ala Asn Lys Met Ala Gln Phe
645 650 655

Lys Tyr Lys Lys Val Ser Asn Ile Ser Ala Glu Thr Leu Asp Ser Glu
660 665 670

Glu Asn Thr Cys Val Tyr Ser Ser Glu Lys Tyr Asn Ser Asp Pro Trp
675 680 685

Ile Phe Leu Ser Pro Glu Thr Glu Ala Val Phe Thr Lys Phe Thr Glu
690 695 700

Ala Gln Phe Glu Lys Leu Gly Glu Ile Thr Asp Ile Ser Val Gly Leu
705 710 715 720

Gln Thr Ser Ala Asp Lys Ile Tyr Ile Phe Ile Pro Glu Asn Glu Thr
725 730 735

Ser Asp Thr Tyr Ile Phe Asn Tyr Lys Gly Lys Arg Tyr Glu Ile Glu
740 745 750

Lys Ser Ile Cys Cys Pro Ala Ile Tyr Asp Leu Ser Phe Gly Ser Phe
755 760 765

Glu Ser Ile Gln Gly Asn Ala Gln Met Ile Phe Pro Tyr Glu Ile Arg
770 775 780

Asp Glu Glu Ala Tyr Leu Leu Glu Glu Glu Thr Leu Glu Asn Asp Tyr
785 790 795 800

Pro Leu Ala Trp Asn Tyr Leu Asn Glu Phe Lys Glu Ala Leu Glu Lys
805 810 815

Arg Ser Leu Gln Gly Arg Asn Pro Lys Trp Tyr Gln Tyr Gly Arg Ser
820 825 830

Gln Ser Leu Ser Lys Phe His Asp Lys Glu Lys Leu Ile Trp Thr Val
835 840 845

Leu Ala Thr Lys Pro Pro Tyr Val Leu Asp Arg Asn Asn Leu Leu Phe
850 855 860

Thr Gly Gly Gly Asn Gly Pro Tyr Tyr Gly Leu Ile Asn Gln Ser Ile
865 870 875 880

Tyr Ser Leu His Tyr Phe Leu Gly Ile Leu Ser His Pro Val Ile Glu
885 890 895

Ser Met Val Lys Ala Arg Ala Ser Glu Phe Arg Gly Ser Tyr Tyr Ser
900 905 910

His Gly Lys Gln Phe Ile Glu Lys Ile Pro Ile Arg Lys Ile Asp Phe
915 920 925

Asp Asp Gln Asp Glu Val Asp Lys Tyr Asn Thr Val Val Thr Thr Val
930 935 940

Glu Lys Leu Ile Ile Thr Thr Asp Arg Ile Lys Ser Glu Ser Asn Gly
945 950 955 960

Pro Arg Arg Arg Met Leu Arg Arg Arg Leu Asp Ala Leu Ser Asn Gln
965 970 975

Leu Ile Gln Val Ile Asn Glu Leu Tyr Asn Ile Ser Asp Glu Glu Tyr
980 985 990

Thr Thr Val Leu Asn Asp Glu Met Leu Thr Ala Ala Leu Gly Glu Glu
995 1000 1005

Lys



<210> 5
<211> 24
<212> DNA
<213> Bacillus pumilus
<400> 5

gtggaaacgg accgtattat ggtt 24

<210> 6
<211> 24
<212> DNA
<213> Bacillus pumilus
<400> 6

caccagtaaa taacaggtta ttcc 24

<210> 7
<211> 27
<212> DNA
<213> Bacillus pumilus
<400> 7

ttcgtagcaa gtacggtcca tatcagt 27

<210> 8
<211> 27
<212> DNA
<213> Bacillus pumilus
<400> 8

ccgtatgtac ttgataggaa taacctg 27

<210> 9
<211> 24
<212> DNA
<213> Bacillus pumilus
<400> 9

aggaactaag aaagttcata gctg 24

<210> 10
<211> 24
<212> DNA
<213> Bacillus pumilus
<400> 10

atgcggtatt atataaccca acag 24

<210> 11
<211> 24
<212> DNA
<213> Bacillus pumilus
<400> 11

tgacgtcctc ttcacctaat tcgg 24

<210> 12
<211> 24
<212> DNA
<213> Bacillus pumilus
<400> 12

gagtttgtga agatagaacc attg 24

<210> 13
<211> 48
<212> DNA
<213> Bacillus pumilus
<400> 13

agcggatccg gaggtaaata aatgaatcaa ttaattgaaa atgttaat 48

<210> 14
<211> 42
<212> DNA
<213> Bacillus pumilus
<400> 14

aagggggcat gcttatactt atttcttcgt tctattgttt ct 42

<210> 15
<211> 51
<212> DNA
<213> Bacillus pumilus
<400> 15

caaggatccg gaggtaaata aatgcatata agtgagttag tagataaata c 51

<210> 16
<211> 36
<212> DNA
<213> Bacillus pumilus
<400> 16

ttaggatcct catttttctt ctcctaacgc cgctgt 36

<210> 17
<211> 54
<212> DNA
<213> Bacillus pumilus
<400> 17

caccaatcta gaggaggtaa ataaatgcat ataagtgagt tagtagataa atac 54

<210> 18
<211> 42
<212> DNA
<213> Bacillus pumilus
<400> 18

tgaaatctcg agttatcctg atccacaaca tatatctgct at 42

1. Isolated DNA segment coding for the BpmI restriction endonuclease, wherein the isolated DNA is obtainable from
Bacillus pumilus (New England Biolabs collection #711).

2. A recombinant DNA vector comprising a vector into which a DNA segment encoding the BpmIRM restriction
endonuclease has been inserted.

3. Isolated DNA segment coding for the BpmI restriction endonuclease and BpmI methylase M1, wherein the isolated
DNA is obtainable from ATCC No. PTA-2598.

4. A cloning vector which comprises the isolated DNA of claim 3.

5. A host cell transformed by the vector of claims 2 or 4.

6. A method of producing recombinant BpmI restriction endonuclease comprising culturing a host cell transformed
with the vector of claims 2 or 4 under conditions for expression of said endonuclease.

7. A method for modifying the specificity of a target restriction-modification system comprising the steps:

(a) isolating DNA coding for a Type IIf restriction-modification system and deleting the methylation-specificity
domains of said Type IIf restriction-modification system;

(b) inserting a DNA linker coding for an appropriate restriction site and one or more amino acids at the deletion
site of step (a); and

(c) inserting a methylation-specificity fusion from a second Type IIf restriction-modification system adjacent
the DNA linker of step (b) to form a modified target restriction-modification system.



FIG. 1 [diagram; recoverable labels: BpmIRM gene with R and M regions, a second gene, and deletion clones BpmI-Δ#2 and BpmI-Δ#3]
FIG. 2A [nucleotides 1-1020 of the 1650-bp sequence given as SEQ ID NO:1, shown with the one-letter amino acid translation given as SEQ ID NO:2]

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FIG. 2B 



TTTACAGAAAATGATTTTCATGAAAATTCAATGGAACAAAAGGCAATTCACCTATTTTTG 

1021 ♦ ♦ ♦ ♦ ♦-— ♦ 1080 

FTENOFHENSMEQKAIHIFL 
CCACCAGATGAAGATTTTGAAAAGTTACCGATTGAGTGTCAAAATTATATCAAGTATGGG 

10B1 ♦ ♦ ♦ — ♦ ♦ — ♦ 1140 

PPDEDFEKLPIECQNYIKYG 
GAAGAAAAAGGCTTCCATCAAGGCTATAAAACCAGAATTAGAAAACGTTGGTATATAACT 

1141 ♦ * ♦ ♦ • ♦ 1200 

EEKGFHOGYKTRIRKRWYIT 
CCATCTAGATGGGTTCCAGATGCTTTTGCTTTAAGACAGGTTGATGGCTATCCAAAACfA 

1201 ♦ ♦ • ♦ * • 12G0 

PSRWVPOAFALRQVDGYPKL 
ATTTTAAATGAAACCGACGCTTCTTCTACTGATACAATTCATAGGGTTAGATTTAAAGAA 

1261 ♦ * ♦ ♦ ♦ ♦ 1320 

ILNETOASSTDTIHRYRFKE 
G6TATAAATGAAAAGTTAGCC6TAGTTTCATTTTTGAACTCACTCACTTTT6CATCTTCA 

1321 ♦ » ♦ -♦ 1380 

6INEKLAYVSFLNSLTFASS 
GAMTAACGGGGAGAAGTTATGGTGGTGGTGTTATGACATTCGAACCAACTGAAATTGGA 

1381 -- ♦— ♦ ♦ ♦ ♦ — - 1440 

EITGRSYGGGVHTFEPTEIG 
GAAATCCTAATACCTTCCTTTGATAACTTATCCATTGATTTTGATAAAATTGATGCCTTA 

1441 ♦ ♦ ♦ ♦ ♦ ♦ 1500 

EILIPSFDNLSIDFDKIDAL
ATTCGAGAAAAGGAGATTGAAAAAGTCCTTGATATTGTTGATGAAGCTTTACTTATAAAA

1501 ♦ ♦ * ♦ ♦ ♦ 1560

IREKEIEKVLDIVDEALLIK
TATCATGGGTTTAGTGAGAAAGAAGTAAAACAGCTTCGAGGGATATGGAAGAAACTTTCT

1561 ♦ ♦ ♦ ♦ • ♦ 1620

YHGFSEKEVKQLRGIWKKLS
CAGAGAAGAAACAATAGAACGAAGAAATAA 

1621 ♦ - - ♦ 1650

QRRNNRTKK*
FIG. 3A

ATGCATATAAGTGAGTTAGTAGATAAATACAAAGCGCATAGAAGTACTTTTTTAAAACCA

1 - ♦- ♦ -♦- ♦ ♦ » 60

MHISELVDKYKAHRSTFLKP
ACTTATAATGAAACTCAACTAAGGAATGATTTTATAGACCCACTTCTAAAATCTTTAGGA 

61 ♦ ♦ ♦ --♦ » 120

TYNETQLRNDFIDPLLKSLG
TGGGATGTTGATAATACCAAAGGAAAAACACATATTCTAAGAGATGTCATTCAAGAAGAA

121 ♦ ♦ ♦ ♦ ♦ ♦ 180

WDVDNTKGKTHILRDVIQEE
TACATAGAAATAAAAGATGAGGAGACAAAGAAAAATCCAGATTATACACTTCGTATAAAC 

181 ♦ 4 ♦ ♦ ♦ ♦ 240 

YIEIKDEETKKNPDYTLRIN 
GGTACGAGAAAGCTGTTTGTAGAGGTTAAGAAACCGTCTTTTAATATTTTGAAATCAGCT

241 ♦ ♦ , ♦ ♦ 300

GTRKLFVEVKKPSFNILKSA
AAAGCAGCCTTCCAAACAAGAAGATATGGTTGGAGTGCTAACCTTGGTATTTCAGTACTT 

301 ♦ ♦ 4 ♦ ♦ ♦ 360

KAAFQTRRYGWSANLGISVL
ACAAATTTCGAGCATCTAGTTATTTATGATTGTAGATATACGCCTGACAAATCCGACAAT 

361 ♦ ♦ ♦ ♦ ♦ » 420 

TNFEHLVIYDCRYTPDKSDN
GAACATATTGCTAGATATAAAGTTTTCTCTTACGAGGAATATGAAGAAGCATTTGATGAA

421 ♦ ♦ ♦ ♦ ♦ 480

EHIARYKVFSYEEYEEAFDE
ATAAAGGATATAATTTCATATGAGTCAGCCAACTCAGGTGCTCTGGACGAAATGTTTGAT 

481 ♦ ♦ ♦ ♦ ♦ » 540 

IKDIISYESANSGALDEMFD
GTAAATACAAGAGTTGGTGAAACGTTTGACGAGTATTTTTTACAGCAAATTGAGAATTGG

541 ♦ ♦ ♦ ♦ ♦ ♦ 600

VNTRVGETFDEYFLQQIENW
CGCGAAAAGCTAGCTAAAACTGCAATTAAAAATAACACCGAATTAGGTGAAGAGGACGTC 

601 ♦ * ♦ » ♦ 660

REKLAKTAIKNNTELGEEDV
AATTTTATTGTCCAAAGACTATTAAACAGAATTATTTTTCTTAGAGTTTGTGAAGATAGA

661 ♦ ♦ ♦ ♦ t ♦ 720

NFIVQRLLNRIIFLRVCEDR
ACCATTGAAAAATATGAAACAATTAAAAGTATAAAAAACTATGAGGAATTAAAAGATCTG 

721 ♦ ♦ 4 ♦ . ♦ 780 

TIEKYETIKSIKNYEELKDL 
TTTCAAAAGTCTGATAGGAAATTTAATTCAGGTCTCTTTGACTTCATAGATGATACGCTC

781 4- ♦ ♦ ♦ ♦ 840

FQKSDRKFNSGLFDFIDDTL
TTGCTTGAGGTTGAAATTGATTCGAATGTATTGATAGAAATTTTTAGTGATTTATATTTC 
841 ♦ ♦ ♦ + ♦ f 900 

LLEVEIDSNVLIEIFSDLYF
CCACAAAGCCCATATGATTTTTCTGTTGTCGATCCAACAATATTAAGCCAGATATATGAA 

901 ♦ 4 ♦ --♦ ♦ ♦ 960

PQSPYDFSVVDPTILSQIYE
CGTTTTCTAGGTCAAGAAATAATTATAGAGTCAGGTGGTACATTTCACATTACGGAGTCA 

961 —4 ♦ ♦ — ♦ 1020 

RFLGQEIIIESGGTFHITES 



FIG. 3B 

CCAGAAGTTGCGGCGTCCAATGGTGTTGTTCCAACTCCAAAAATTATCGTCGAACAGATA

1021 * ♦ ♦ • ♦ ♦ 1080

PEVAASNGVVPTPKIIVEQI
GTGAAAGACACTTTAACGCCCCTTACGGAAGGCAAAAAATTTAATGAGCTATGTAACTTA 

1081 ♦ ♦ ♦ ♦ ♦ 1140 

VKDTLTPLTEGKKFNELCNL 
AAAATAGCAGATATATGTTGTGGATCAGGAACTTTCCTAATTTCAAGTTATGACTTTCTA

1141 ♦ » ♦ ♦ ♦ ♦ 1200

KIADICCGSGTFLISSYDFL
GTAGAGAAAGTAATGGAAAAGATAATAGAAGAGAACATCGATGATTCAGATTTAGTATAT

1201 ♦ * ♦ i ♦ f 1260

VEKVMEKIIEENIDDSDLVY
GAAACTGAAGAAGGGCTAATTTTGACACTTAAAGCAAAAAGAAATATCTTGGAGAATAAT

1261 ♦ ♦ ♦ ♦ 1320 

ETEEGLILTLKAKRNILENN 
TTGTTTGGTGTTGATGTTAATCCATACGCTGTTGAAGTAGCTGAGTTCAGTTTATTATTA 

1321 ♦ ♦ ♦ -♦ f 1380 

LFGVDVNPYAVEVAEFSLLL
AAGCTATTAGAAGGTGAGAATGAGGCATCGGTTAATAATTTCATTCACGAGCATGAGGAT 

1381 ♦ ♦ ♦ ♦ — -» 1440 

KLLEGENEASVNNFIHEHED
AAAATATTAIXGttTTTAACATCTATTATTAAATGTGGAAACAGCTTAGTAGATAATAAG

1441 ♦ f ♦ ♦ ♦ ♦ 1500

KILPDLTSIIKCGNSLVDNK
TTTTTTGAATTWTGCl^GAATCGTTAGAGGACGATGAAATCTTATTTAAGGCTAATCCA

1501 ♦ f ♦ ♦ f ♦ 1560

FFEFMPESLEDDEILFKANP
TTTGAATGGGAAGAGGAGTTTCCAGATATTATGGCAAATGGTGGCTTTGATGCTATTATA

1561 ♦ ♦ ♦ ♦ t ♦ 1620

FEWEEEFPDIMANGGFDAII
GGAAATCCACCTTATGTTCGAATACAGAACATGAAAAAATATAGTCCTGAGGAAATTGAA 

1621 ♦ • ♦ ♦ ♦ » 1680 

GNPPYVRIQNMKKYSPEEIE
TATTATCAATCAAAAGACTCTGAATATACTGTTGCAAAAAAAGAAACAGTTGACAAGTAT

1681 ♦ ♦ - ♦ f- ♦ ♦ 1740

YYQSKDSEYTVAKKETVDKY
TTTTTATTTATTGAGAGAGCATTAATATTACTCAATCCTACTGGGCTGTTGGGTTATATA 

1741 » ♦ -♦ — - ».. 1800 

FLFIERALILLNPTGLLGYI 
ATACCGCATAAATTCTTTATTACAAAAGGTGGTAAGGAACTAAGAAAGTTCATAGCTGAA 

1801 ♦ * * ♦ » ♦ 1860

IPHKFFITKGGKELRKFIAE 
AAACATCAAATATCAAAAATTATAAATTTTGGTGTTACACAGGTCTTTCCAGGAAGAGCG

1861 t ♦ ♦ ♦ ♦ * 1920

KHQISKIINFGVTQVFPGRA
ACATATACGGCTATTTTAATTATCCAAGCAAATAAAATGGCACAGTTCAAGTATAAGAAA 

1921 * * ^ t f 1980 

TYTAILIIQANKMAQFKYKK
GTAAGTAATATATCAGCAGAAACCCTAGATTCTGAAGAAAATACGTGTGTTTATAGCTCA 

1981 ♦ ♦ ♦ ♦ » ♦ 2040 

VSNISAETLDSEENTCVYSS 
GAAAAGTATAATTCTGACCCTTGGATATTTTTATCTCCTGAAACAGAAGCTGTTTTTACT

2041 ♦ 4 ♦ ♦ ♦ ♦ 2100 

EKYNSDPWIFLSPETEAVFT
AAATTTACAGAAGCTCAATTTGAGAAACTTGGAGAAATCACTGATATAAGTGTAGGACTA

2101 * 4 4 * ♦ ♦ 2160

KFTEAQFEKLGEITDISVGL
CAAACAAGCGCTGATAAAATATATATTTTTATTCCTGAAAATGAAACTTCAGATACATAT

2161 4 -4 * 4 4 ♦ 2220

QTSADKIYIFIPENETSDTY
ATATTTAATTATAAAGGGAAAAGATATGAAATAGAAAAATCTATATGTTGCCCAGCTATC 

2221 4 4 ♦ ♦ ♦ » 2280 

IFNYKGKRYEIEKSICCPAI 
TATGACTTATCTTTTGGTTCTTTTGAAAGCATTCAGGGAAATGCACAAATGATATTCCCT 

2281 4 4 ♦ ♦ . ♦ 2340 

YDLSFGSFESIQGNAQMIFP
TATGAAATCAGAGATGAAGAAGCATATCTACTAGAGGAAGAAACGCTTGAAAATGATTAT

2341 4.- ♦ ♦ ♦ , ♦ 2400

YEIRDEEAYLLEEETLENDY
CCTCTTGCTTGGAATTATTTGAATGAGTTTAAAGAAGCTCTTGAAAAAAGAAGCTTACAA 

2401 4 ♦ -4 -4 » 2460

PLAWNYLNEFKEALEKRSLQ 
GGCCGTAATCCGAAATGGTATCAGTATGGTCGGTCCCAAAGTTTATCAAAATTTCATGAT

2461 - 4 ♦ ♦ ♦ ♦ ♦ 2520

GRNPKWYQYGRSQSLSKFHD
AAAGAAAAACTGATATGGACCGTACTTGCTACGAAACCCCCGTATGTACTTGATAGGAAT

2521 ♦ ♦ 4 4 » 4 2580

KEKLIWTVLATKPPYVLDRN
AACCTGTTATTTACTGGTGGTGGAAACGGACCGTATTATGGTTTAATTAACCAATCTATT

2581 4 t , f 2640

NLLFTGGGNGPYYGLINQSI
TACTCTTTGCATTATTTTTTAGGTATTCTTTCACATCCTGTAATAGAAAGTATGGTAAAA 

2641 4 ♦ -4 , ♦ 4 2700

YSLHYFLGILSHPVIESMVK
GCAAGGGCCAGTGAATTTAGGGGATCATATTATTCTCATGGAAAACAATTTATTGAGAAA

2701 — 4 4 4 4 — — 4 2760 

ARASEFRGSYYSHGKQFIEK 
ATCCCAATTAGAAAGATTGATTTTGATGATCAAGATGAGGTAGACAAATATAATACGGTG

2761 4 4 4 4 4 4 2820

IPIRKIDFDDQDEVDKYNTV
GTCACAACAGTAGAAAAATTAATTATAACTACCGATAGAATTAAAAGTGAGAGCAATGGA

2821 4 , ♦ 4 4 * 2880

VTTVEKLIITTDRIKSESNG
CCCCGGAGGAGAATGTTAAGAAGAAGGTTAGATGCTTTGTCTAATCAACTTATCCAGGTT

2881 4 4 » 4 4 4 2940

PRRRMLRRRLDALSNQLIQV
ATTAATGAACTTTATAATATCAGTGACGAAGAATATACGACAGTTTTGAATGATGAAATG

2941 4 ♦ ♦ 4 4 3000

INELYNISDEEYTTVLNDEM
TTGACAGCGGCGTTAGGAGAAGAAAAATGA 

3001 4 « ♦ 3030 

LTAALGEEK*
FIG. 4
BpmI    HEPARIN SEPHAROSE FRACTIONS
WITH RECOMBINANT BpmI